Deficits in numerical magnitude perception characterize the mathematics learning disability developmental
dyscalculia (DD), but recent studies suggest the relation stems from inhibitory control demands from incon- 
gruent visual cues in the nonsymbolic number comparison task. This study investigated the relation among 
magnitude perception during differing congruency conditions, executive function, and mathematics achieve- 
ment measured longitudinally in children (n = 448) from ages 4 to 13. This relation was investigated across 
achievement groups and as it related to mathematics across the full range of achievement. Only performance 
on incongruent trials related to achievement. Findings indicate that executive function in a numerical context, 
beyond magnitude perception or executive function in a non-numerical context, relates to DD and mathematics
across a wide range of achievement.


Mathematical thinking pervades nearly all aspects 
of modern life, from personal accounting to under- 
standing important information about one’s health. 
Accordingly, individuals with poor mathematical 
skills are less likely to graduate high school, go to
college, or have steady employment (Bynner & Parsons,
2006; Rivera-Batiz, 1992) and are at higher risk for
physical and mental health problems (Bynner & Parsons,
2006; Duncan et al., 2007; Hibbard et al., 2007). The 
development of mathematical skills can be affected 
by a range of factors including education, home 
environment, and reading ability. However, a sub- 
stantial body of research indicates that individual 
differences in the cognitive system used to perceive 
and manipulate numerical magnitudes, often 
labeled the approximate number system (ANS; 
Feigenson, Dehaene, & Spelke, 2004), play a foun- 
dational role in mathematics development (Chen & 
Li, 2014; Schneider et al., 2017; Schwenk et al., 
2017). Furthermore, an estimated 3%-6% of the 
population is affected by the specific mathematics 
learning disability developmental dyscalculia (DD; 
Shalev, Auerbach, Manor, & Gross-Tsur, 2000;
Szűcs & Goswami, 2013). Individuals with DD display
difficulties with fundamental aspects of
numerical processing from very early ages and con- 
tinue to struggle with math, even when given the 
same schooling opportunities as their peers. How- 
ever, the nature of these numerical deficits and their 
relation to the abilities of typically developing (TD) 
populations remain poorly understood.


The ANS, Mathematics Achievement, and Dyscalculia 


The most commonly used behavioral measure of 
ANS function is the nonsymbolic number compar- 
ison task. In this task, participants judge which of 
two groups of objects, such as dots or squares, is 
more numerous. Higher accuracy rates and faster 
response times are thought to indicate higher acuity 
and enhanced efficiency of the ANS (Inglis & Gil- 
more, 2014). There is considerable support for a 
relation between efficiency of the ANS and mathe- 
matics achievement, both as a marker for DD (for 
reviews, see Iuculano, 2016; Szkudlarek & Brannon, 
2017) and across the full range of mathematics 
achievement (for meta-analyses, see Chen & Li, 
2014; Schneider et al., 2017). 

Accordingly, the dominant theory regarding a 
core deficit in DD proposes an impairment of the 
ANS, in part because individuals with DD have 
been shown to perform more poorly in tasks 
designed to measure the ANS, such as the nonsymbolic
number comparison task (Mazzocco, Feigenson,
& Halberda, 2011; Mejias, Mussolin, Rousselle,
Grégoire, & Noel, 2012). Furthermore, neuroimag- 
ing research suggests that individuals with DD 
have atypical structure and function of proposed 
neural substrates of the ANS, such as the intrapari- 
etal sulcus (Ashkenazi, Black, Abrams, Hoeft, & 
Menon, 2013; Dinkel, Willmes, Krinzinger, Konrad, 
& Koten, 2013; Kaufmann et al., 2009; Mussolin 
et al., 2010; Price, Holloway, Rasanen, Vesterinen, 
& Ansari, 2007; Rosenberg-Lee et al., 2015; Rotzer 
et al., 2008; Rykhlevskaia, Uddin, Kondos, &
Menon, 2009). Given this evidence, many research- 
ers suggest that deficits in symbolic number pro- 
cessing, arithmetic fluency, and higher order 
mathematical thinking stem from a core deficit in 
the ANS (Butterworth, Varma, & Laurillard, 2011; 
Iuculano, Tang, Hall, & Butterworth, 2008; Wilson 
& Dehaene, 2007). 

Although there is some consensus that the ANS 
is atypical in individuals with DD, there is much 
disagreement as to the true mechanistic nature of 
this deficit (Szűcs & Goswami, 2013), its causal role
in DD (Mazzocco & Rasanen, 2013), and whether 
the deficit is isolated to the ANS or may be con- 
comitant with deficits in symbolic representation of 
number or issues related to executive functions 
(Fias, Menon, & Szűcs, 2013; Rousselle & Noel,
2007; Szűcs, Devine, Soltesz, Nobes, & Gabriel,
2013). It should be further stated that the develop- 
mental relation between the ANS and the acquisi- 
tion of symbolic number faculty is both important 
and not well understood. It is important in that 
mathematics is inherently symbolic, and further, 
most symbolic number tasks have a significantly 
stronger relation to math achievement than non- 
symbolic tasks (De Smedt, Noel, Gilmore, & Ansari, 
2013; Fazio, Bailey, Thompson, & Siegler, 2014; 
Geary et al., 2018; Holloway & Ansari, 2009; Sch- 
neider et al., 2017). Therefore, the importance of the 
ANS for math development may depend on its 
relation to the acquisition of symbolic number 
(Reynvoet & Sasanguie, 2016; vanMarle et al., 2018) 
or their continued relation throughout development 
(Leibovich & Ansari, 2016), but remains a matter of 
considerable debate. 

Adding to this complication, individual differ- 
ences in ANS acuity consistently correlate with 
mathematics across the full range of achievement 
(Halberda, Mazzocco, & Feigenson, 2008; Keller & 
Libertus, 2015; Schneider et al., 2017), suggesting 
the relation is not isolated to group differences that 
identify severe mathematics deficits but rather
extends broadly across achievement levels. As a
result, it remains unclear whether DD represents a 
qualitatively distinct subgroup with distinct cogni- 
tive deficits or is the lowest extreme of a continuous 
distribution. This distinction is important for devel- 
oping appropriate intervention strategies to remedi- 
ate low mathematics skills (Butterworth & Kovas, 
2013; Henik, Rubinsten, & Ashkenazi, 2011). For 
example, if individuals with DD are identified as 
suffering from a specific impairment of magnitude 
processing that is qualitatively distinct in its mecha- 
nistic origin from their TD peers, it would suggest 
that remediation should target the training of this 
uniquely impaired mechanism. 


Nonsymbolic Number Comparison as a Measure of the ANS?


One problem undermining the link between 
ANS function and mathematics development is the 
reliance on nonsymbolic number comparison as a 
measure of ANS acuity. Conventionally, nonsym- 
bolic number comparison performance has been 
interpreted as a measure of ANS function (De 
Smedt et al., 2013). However, recent research sug- 
gests that the task may be measuring more than 
ANS acuity alone. Specifically, several studies have 
shown that nonsymbolic number comparison is 
highly influenced by the visual parameters of task 
stimuli (Gebuis & Reynvoet, 2011, 2012; Leibovich 
& Henik, 2013; Szűcs, Nobes, et al., 2013). For
example, Szűcs, Devine, et al. (2013) and Szűcs,
Nobes, et al. (2013) showed that congruency effects
have a large impact on the ratio-based internal
Weber fraction, or w, a common metric of ANS acuity.
Furthermore, the impact was even
greater in children than in adults, leading them to
suggest the visual parameter confound could also 
be complicated by an interaction with development. 
In general, visual properties such as surface area 
and object size covary with numerosity. If these 
properties are not controlled when creating stimuli, 
participants can rely on non-numerical cues to 
select the more numerous array. Thus, to ensure 
participants employ a strategy focused on numeros- 
ity, stimuli are designed such that, in some trials, 
the more numerous dot set has a greater surface 
area or dot size (congruent trials), and in other tri- 
als a lesser surface area or dot size (incongruent tri- 
als; e.g., Dehaene, Izard, & Piazza, 2005). 

Recent studies suggest that performance on 
incongruent trials may drive the relation between 
nonsymbolic number comparison and mathematics 
performance (Bugden & Ansari, 2016; Clayton,
Gilmore, & Inglis, 2015; Cragg, Keeble, Richardson,
Roome, & Gilmore, 2017; Fuhs & McNeil, 2013; Gilmore
et al., 2013; Keller & Libertus, 2015). For
example, in a study comparing nonsymbolic num- 
ber comparison performance in children with DD 
versus TD peers, Bugden and Ansari (2016) found 
that children with DD only differed on incongruent 
trials. A follow-up analysis showed that children’s 
visuospatial working memory predicted ANS acuity
on incongruent trials, indicating that visuospatial
working memory may be an important
cognitive process utilized for extraction of numeros- 
ity in the presence of other visually salient informa- 
tion. Similarly, studies by Gilmore et al. (2013) and 
Fuhs and McNeil (2013) found that only perfor- 
mance on incongruent trials of the nonsymbolic 
number comparison task was related to mathematics
performance across a wide range of mathematics
achievement in primary school children and preschoolers,
respectively. To explain this specific relation, the 
authors of those studies suggest that incongruent, 
non-numerical visual cues in the comparison task 
require participants to inhibit their visually based 
response before making a quantity-based judgment, 
thus engaging inhibitory control mechanisms. 
Accordingly, both Gilmore et al. and Fuhs and
McNeil posit that inhibitory control and selective 
attention demands of incongruent trials, rather than 
ANS acuity, drive the relation between nonsym- 
bolic comparison performance and mathematics. 
Indeed, after controlling for inhibitory control, the 
relation between mathematics performance and 
nonsymbolic comparison was no longer statistically
significant in either study.


The ANS and Executive Function 


Still, the contribution of executive function to the 
relation between nonsymbolic number comparison 
and mathematics performance remains unclear. In 
contrast to Gilmore et al. (2013) and Fuhs and 
McNeil (2013), both Keller and Libertus (2015) and 
Gilmore, Keeble, Richardson, and Cragg (2015) 
found that the relation between accuracy in the 
number comparison task and mathematics persisted 
when controlling for inhibitory control, which sug- 
gests the relation between number comparison per- 
formance and mathematics is not fully accounted 
for by domain-general inhibitory control. Starr, 
DeWind, and Brannon (2017) compared the relation 
between mathematics achievement and the influ- 
ence of numerical acuity as distinct from the influ- 
ence of non-numerical visual parameters on 
nonsymbolic number comparison performance 
while also measuring inhibitory control in a non-numerical
task (i.e., day/night in children and flanker
in adults) in a 4- and 6-year-old sample and a
sample of adults. Their results indicated that 
numerical acuity correlated with higher math scores 
in the 6-year-old sample, whereas non-numerical 
bias and inhibitory control did not, which, in agree- 
ment with the two previous studies, suggests that 
numerical discrimination relates to mathematics 
achievement. However, Starr et al.’s measure of 
non-numerical bias is a regression term that 
accounts for the influence of visual parameters on 
participants’ behavior, which is somewhat distinct 
from performance on trials where visual informa- 
tion is incongruent with numerosity and would 
more directly address the notion of a number-speci- 
fic executive function. Furthermore, it should be 
noted that all five of these studies focused on inhi- 
bitory control in a TD sample, whereas Bugden and 
Ansari’s (2016) findings related performance on 
incongruent trials of the nonsymbolic comparison 
task to group differences between DD and TD chil- 
dren. In addition to the group differences versus 
individual differences distinction between studies, 
Bugden and Ansari investigated the role of visuospatial
working memory as opposed to inhibitory control. 

Although dominant models indicate that execu- 
tive function can be divided into the broad cate- 
gories of working memory/updating, inhibitory 
control, and attention shifting (Bull & Scerif, 2001; 
Miyake et al., 2000), most prior studies on nonsym- 
bolic comparison and mathematics achievement 
have controlled for only one aspect of executive 
function, either working memory or inhibitory con- 
trol. As a result, the more fine-grained mechanistic 
relations between executive function deficits and 
ANS deficits have been difficult to determine. To 
address these issues, this study focuses on two out- 
standing questions regarding the relation among 
the ANS, executive function, and mathematics 
achievement in typically and atypically developing 
individuals in order to provide more information 
about the specific mechanisms at play during non- 
symbolic number comparison. 

First, what are the mechanisms underlying the 
relation between performance on incongruent trials 
of the nonsymbolic comparison task and mathemat- 
ics achievement as compared to congruent trials? 
Previous studies have framed the correlation 
between nonsymbolic comparison performance and 
mathematics achievement as attributable to either 
individual differences in the ANS or executive func- 
tion. An additional possibility is that incongruent 
trials on the nonsymbolic number comparison task 
require an interaction of executive function and the
ANS, or in other words, a number-specific executive 
function. Rather than the relation between number 
comparison performance and math achievement 
depending on neurocognitive mechanisms associ- 
ated with numerical magnitude processing or exec- 
utive function independently, a deficit could 
originate from the biological interplay of these two 
mechanisms. Successfully answering an incongruent 
trial requires selective attention to the discrete 
quantity of each dot set while ignoring other sali- 
ent, yet irrelevant, stimulus dimensions. Consistent 
with this suggestion, experimental studies have 
demonstrated a distinction between executive func- 
tion related to numerical and non-numerical con- 
tent. In a study of DD adults, individuals with DD 
had difficulty recruiting attention to numerical 
information but not non-numerical information 
under heightened cognitive load (Ashkenazi, Rubin- 
sten, & Henik, 2009). In children, Bull and Scerif 
(2001) demonstrated that inhibitory control and 
working memory of numerical information 
account for significant variance in individual differences
in mathematics ability beyond similar,
non-numerical measures of executive function. 
Therefore, to appropriately account for the possibil- 
ity of an interaction between executive function and 
the ANS, executive function must be measured in 
both non-numerical and numerical contexts. 

Second, is the relation among executive function,
nonsymbolic number comparison, and mathematics 
achievement a specific facet of atypical develop- 
ment, comprising a characteristic of DD that sets 
the disorder qualitatively apart from typical devel- 
opmental trajectories, or is the relation a character- 
istic of a broad range of typical mathematics skill 
development? Previous research appears to suggest 
that measurements of the ANS correlate with math- 
ematics across the full range of mathematics 
achievement (Schneider et al., 2017). At the same 
time, studies suggest that the ANS of individuals 
with DD is neurobiologically atypical and functions 
differently than that of their TD peers (Mazzocco 
et al., 2011; Price et al., 2007). Distinguishing
between these alternatives may provide meaningful 
implications for intervention strategies. 


This Study 


To address the questions above, this study inves- 
tigates the relations among ANS function, executive 
function, and DD by examining performance on the 
nonsymbolic comparison task, separately for congruent
and incongruent trials while controlling for
multiple aspects of executive function. Importantly,
executive function here is measured in a non- 
numerical context. To build directly on previous 
work, we take an approach similar to that of Mazzocco
et al. (2011). We first compare performance in the 
nonsymbolic comparison task across multiple math- 
ematics achievement groups (DD, low achieving 
[LA], and typically achieving [TA]) defined through 
multiple years of consistent achievement, including 
the first 3 years of school entry. Second, we con- 
sider the relation between performance on the non- 
symbolic comparison task and mathematics 
achievement more broadly through a regression 
analysis with a large sample that includes the full 
range of mathematics achievement. In the first anal- 
ysis, if DD is characterized by a distinct core deficit 
of the ANS, performance on both congruent and 
incongruent trials of the task should distinguish 
among achievement groups. If, on the other hand, 
DD is characterized by deficits specific to executive 
function, performance on only the incongruent tri- 
als of the nonsymbolic comparison task should 
account for achievement group differences but not 
after controlling for measures of non-numerical 
executive function. However, if impaired number- 
specific executive function underlies DD, we would 
expect group differences between the DD group 
and the other achievement groups on incongruent 
trials, but not congruent trials, after controlling for 
non-numerical, domain-general executive function- 
ing. Similarly, in the second analysis, if number- 
specific executive function is related to individual 
differences in mathematics achievement across a 
wide range of achievement, not only a distinction 
between DD and the other achievement groups, 
performance on incongruent trials should predict 
mathematics achievement beyond what can be 
accounted for by congruent trials and multiple com- 
ponents of non-numerical executive function. 


Method 
Participants 


The current sample was drawn from a study of 
students who participated in an earlier longitudinal 
study of early mathematical skills (Pre-K to first 
grade; Hofer, Lipsey, Dong, & Farran, 2013). The 
analytic sample for the original study included 771 
children. In the follow-up study, we were able to 
locate 628 students attending public school in the 
2013-2014 year in the same district as they attended 
in Pre-K (16 had withdrawn from the study in first 
grade and were not contacted for further
participation, 29 had moved out of the state, 53 had
moved out of the district, and 45 were not located 
despite all efforts). Of those 628, we obtained par- 
ental consent and assessed 517 children in the 
2013-2014 school year, 506 children in the 2014-2015
school year, and 503 children in the 2015-2016
school year. A total of 497 children were assessed at all three
time points in middle school. English language 
learners (n = 43) were excluded because non-native
language of mathematics instruction could lead to 
low mathematics achievement for reasons other 
than the cognitive factors investigated in this study. 

Our final sample comprised 448 students for 
whom we had measures of mathematics achieve- 
ment from two of the three early time points 
(spring of preschool, kindergarten, and first grade) 
and from two of the three later time points (fifth, 
sixth, and seventh grades), reading achievement 
measured at the end of kindergarten, inhibitory 
control and task switching measured at sixth or 
seventh grade, and working memory measured at 
fifth or sixth grade. This represents a loss of 26 stu- 
dents due to missing data for any of these measures 
from the full middle school follow-up sample 
(n = 517), or 5.0%, and only complete cases given 
the above criteria are analyzed. Methods for resolv- 
ing differences in measurement year are described 
below in the description of each measure. 
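
The participant counts reported above can be checked with simple arithmetic. The short Python sketch below is only a consistency check: the variable names are illustrative, and the derivation of the final sample (the 517 children assessed in 2013-2014, minus 43 English language learners, minus 26 children with missing data) is one assumed reading of the text rather than a procedure the authors state explicitly.

```python
# Counts reported in the Participants section.
original_sample = 771
not_followed = {
    "withdrew in first grade": 16,
    "moved out of state": 29,
    "moved out of district": 53,
    "not located": 45,
}
located = original_sample - sum(not_followed.values())

# Assumed derivation of the final sample (one reading of the text).
assessed_2013_2014 = 517
english_language_learners = 43
missing_data = 26
final_sample = assessed_2013_2014 - english_language_learners - missing_data
missing_pct = round(100 * missing_data / assessed_2013_2014, 1)

print(located, final_sample, missing_pct)  # 628 448 5.0
```

Under this reading, the reported figures of 628 located students, a final sample of 448, and a 5.0% loss to missing data are mutually consistent.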

The final sample was 56.5% female, 9.6% White, 
87.1% Black, 0.7% Hispanic, 1.1% Middle Eastern, 
0.2% Asian or Pacific Islander, and 1.3% other races 
(no further distinction of race available). Of the 448 
students who should have been in sixth grade in 
the 2014-2015 school year if they had not been 
retained or promoted early, 78 (17.4%) were still in 
fifth grade and 1 (0.2%) had been promoted to sev- 
enth grade. Students were located in 76 schools in 
the first year of the follow-up study (fifth grade), 
including 31 elementary schools, 27 middle schools, 
11 charter schools, and 7 Innovation Cluster schools 
(i.e., schools that had been targeted for additional 
resources to boost achievement). Family income 
level was inferred on the basis of whether participants
qualified for free or reduced lunches (i.e.,
family income < 1.85 times the U.S. federal income 
poverty guideline). In the current sample, 88.6% of 
participants qualified for free or reduced lunch,
10.3% did not, and 1.1% of individuals were missing
economic status data. Pre-K through first grade and 
fifth through seventh-grade waves of data collection 
were used to define mathematics achievement 
groups. Nonsymbolic comparison performance was 
utilized from sixth grade because concurrent mea- 
sures of working memory and executive function 
were available for children in that year. Mean age
at the end of pre-K, the first data point, was 
5.1 years (SD = 0.3, range = 4.5-6.4). See Table S1 
for full descriptive statistics. 


Achievement Groups 


Individuals were placed in achievement groups if 
their mathematics achievement scores were consis- 
tently in the designated achievement range at two of 
the three early assessments (pre-K-first grade) AND 
two of the three later assessments (fifth-seventh 
grades). Given these criteria, 222 children fit into con- 
sistent achievement groups across early and later 
assessment periods, thus excluding 226 children
from the full sample of 448 whose
achievement level varied beyond the defined threshold
across time points. Descriptive statistics for the
achievement group sample (n = 222) are broken
down by achievement group in Table 1.

Our first set of analyses asked whether perfor- 
mance on congruent or incongruent trials of non- 
symbolic number comparison distinguished children 
with DD from their LA and TA peers. One com- 
monly used threshold for defining DD is perfor- 
mance in the lowest 10th percentile of standardized 
mathematics achievement tests (Dinkel et al., 2013; 
Mazzocco et al., 2011). Several studies comparing 
groups of students achieving in the lowest 10th percentile
to those in the 11th-25th percentiles reveal
important qualitative differences in cognitive profiles
(Geary, Hoard, Byrd-Craven, & Nugent, 2007; Mazzocco
& Myers, 2003), notably indicating that the
lowest achievement group had an impairment in 
nonsymbolic magnitude processing compared to all 
other achievement groups (Mazzocco et al., 2011). 
Therefore, in this study, we assigned participants to 
three different mathematics achievement groups:
DD individuals (< 10th percentile), LA individuals
(10th-25th percentile), and TA individuals (25th-95th
percentile). With these grouping criteria, 22 children
met the criteria for DD, 12 for LA, and 188 for
TA. Only one individual consistently scored > 95th 
percentile, a commonly used criterion for school 
placement in gifted and talented programs, and a 
common threshold for designating high achieving 
groups in research (e.g., Hoard, Geary, Byrd-Craven,
& Nugent, 2008; Mazzocco et al., 2011). This individ- 
ual was removed from further analysis. 
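
The grouping rule described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function names, the input format (lists of percentile ranks, with None for a missing assessment), and the handling of scores that fall exactly on a boundary (for example, the 25th percentile) are all assumptions.

```python
from collections import Counter
from typing import Optional, Sequence


def band(percentile: float) -> str:
    """Map a percentile rank to an achievement band (boundary handling assumed)."""
    if percentile < 10:
        return "DD"
    if percentile < 25:
        return "LA"
    if percentile <= 95:
        return "TA"
    return "HA"  # above the 95th percentile; not analyzed as a group


def consistent_band(scores: Sequence[Optional[float]]) -> Optional[str]:
    """Return the band reached at two or more of the assessments, else None."""
    counts = Counter(band(s) for s in scores if s is not None)
    for name, count in counts.items():
        if count >= 2:
            return name
    return None


def achievement_group(early: Sequence[Optional[float]],
                      later: Sequence[Optional[float]]) -> Optional[str]:
    """Assign a group only if the same band holds in both assessment periods."""
    early_band = consistent_band(early)
    later_band = consistent_band(later)
    if early_band is not None and early_band == later_band:
        return early_band
    return None
```

For example, a child scoring at the 5th, 8th, and 30th percentiles in the early period and at the 7th and 9th percentiles in the later period (one score missing) would be classified as DD, whereas a child whose early scores fall in the DD range but whose later scores fall in the TA range would receive no group.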

There is a great diversity in definitions and cut- 
off thresholds for defining DD in prior literature, 
and accordingly, findings may not hold across dif- 
ferent criteria for selecting DD groups. To address 
this heterogeneity in the literature, group
comparison analyses in this study were replicated
with another commonly used threshold for determining
DD (achievement < 1.5 SD below the population
mean) and included in Appendix A. Using
this alternative threshold did not alter the results,
suggesting the results are not a product of the chosen
threshold.

Table 1
Descriptive Statistics for Achievement Subgroups

                                                      DD (n = 22, 7 female)            LA (n = 12, 6 female)            TA (n = 188, 106 female)
Achievement group sample                              M       SD     Range             M       SD     Range             M       SD     Range
Age (years), pre-K                                    5.1     0.5    4.5 to 6.4        5.0     0.3    4.7 to 5.5        5.1     0.3    4.5 to 5.6
Age (years), sixth grade                              12.2    0.5    11.4 to 13.4      12.0    0.3    11.6 to 12.5      12.0    0.3    11.4 to 12.6
Nonsymbolic comparison (accuracy, %)                  71.5    5.3    62.9 to 81.4      78.2    6.4    70.0 to 87.1      75.8    5.0    58.6 to 91.4
Nonsymbolic comparison (congruent accuracy, %)        78.7    9.1    63.6 to 90.9      76.9    10.7   54.5 to 86.4      76.3    11.1   40.9 to 95.5
Nonsymbolic comparison (incongruent accuracy, %)      53.5    13.3   27.8 to 83.3      70.8    21.3   33.3 to 94.4      65.7    14.0   33.3 to 94.4
Nonsymbolic comparison (Weber fraction, w)            0.37    0.11   0.21 to 0.65      0.26    0.10   0.13 to 0.48      0.27    0.07   0.10 to 0.56
Backward Corsi (z-score of max span)*                 -1.21   1.22   -2.4 to 0.95      0.03    0.57   -0.75 to 0.95     0.37    0.85   -2.44 to 2.65
Hearts and flowers (z-score of accuracy, %)*          -1.29   0.79   -2.33 to 0.82     -0.16   0.83   -1.90 to 1.83     0.40    0.83   -1.90 to 1.83
Letter-word identification (WCJ-III, standard score)  91.4    9.90   73 to 113         97.4    11.9   73 to 113         115.1   11.9   85 to 144

Note. WCJ-III = Woodcock-Johnson III; DD = developmental dyscalculia; LA = low achieving; TA = typically achieving.
*z-scores presented based on full sample of 448 individuals.

Many previous studies have attempted to isolate 
the neurocognitive mechanisms of DD by studying 
a group of individuals with DD compared to a con- 
trol group matched on IQ and other cognitive abili- 
ties (Landerl, Bevan, & Butterworth, 2004; Mussolin 
et al., 2010; Rotzer et al., 2008). This study does not 
take this approach for two reasons. First, research 
suggests that defining learning disability groups 
through discrepancy criteria excludes many individ- 
uals with dyscalculia who suffer from comorbid 
learning disabilities or other developmental issues. 
Most estimates suggest that 20%—40% of individuals 
with DD also have dyslexia (Shalev, 2004; Willcutt 
et al., 2013; Wilson et al., 2015), and around 25% 
also have attention deficits (Landerl, Gobel, & Moll, 
2013; Shalev, 2004; Shalev, Auerbach, & Gross-Tsur, 
1995). This suggests that DD is inherently heteroge- 
neous and would better be characterized by a 
framework whereby individuals are designated as 
DD through proof of consistent, low mathematics 
achievement over time with the presence of ade- 
quate educational opportunity (Fuchs, Morgan, 


Young, & Rise, 2003). Therefore, rather than 
exclude nondiscrepant individuals, this study fol- 
lows previous literature (Mazzocco et al., 2011) and 
investigates differences in the ANS while control- 
ling for reading achievement and domain-general 
executive function. Second, this study examines the 
intersection of attention mechanisms and magni- 
tude processing mechanisms. Any attempt to define 
groups as a function of broader measures of 
achievement would impede investigation of indi- 
vidual differences in executive function, which is 
known to correlate with academic achievement. 


Procedure 


All students assented and students’ families con- 
sented to participate, and the study was approved 
by the university’s institutional review board. 
Assessments were conducted by trained members of 
the research staff. The nonsymbolic number compar- 
ison task and executive function tasks were adminis- 
tered during the spring semester of the students’ 
sixth-grade year via tablet computer. Testing for 
mathematics achievement was completed in a quiet 
location at the students’ school with one-to-one 
assistance from trained staff during the student’s 
pre-K, kindergarten, first-grade, fifth-grade, sixth- 
grade, and seventh-grade years. Reading achieve- 
ment was assessed at the end of kindergarten. 


Cognitive Tasks 
Nonsymbolic Number Comparison 


Participants were presented with two sets of dots 
simultaneously and asked to indicate via button 
press which set was more numerous (i.e., which set 
contained more dots). The set on the left side of the 
screen contained yellow dots and the set on the right 
side contained blue dots, which corresponded to 
color-coded left and right buttons. Response sides 
were fully counterbalanced. Trials consisted of 
1,200 ms stimulus presentation followed by 1,800 ms 
of fixation (see Figure 1). Seven ratios were presented:
0.33 (5 dots vs. 15 dots), 0.5 (5 vs. 10), 0.67 (6
vs. 9), 0.8 (8 vs. 10), 0.86 (12 vs. 14), 0.88 (7 vs. 8), and 0.9
(9 vs. 10). The number of dots in each stimulus ranged
from 5 to 15. Each ratio was presented 10 times
for a total of 70 trials, which were preceded by six 
practice trials of the easiest two ratios. 
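
The trial structure described above can be reproduced with a few lines of Python. The sketch below takes the dot counts and the number of repetitions directly from the text; the variable and function names are illustrative assumptions, and rounding to two decimals is assumed to match how the ratios are reported.

```python
# (smaller set, larger set) dot counts for the seven ratios reported in the text
DOT_PAIRS = [(5, 15), (5, 10), (6, 9), (8, 10), (12, 14), (7, 8), (9, 10)]
REPETITIONS = 10  # each ratio was presented 10 times


def ratio(smaller: int, larger: int) -> float:
    """Ratio of the smaller to the larger numerosity, rounded to two decimals."""
    return round(smaller / larger, 2)


ratios = [ratio(s, l) for s, l in DOT_PAIRS]
total_trials = len(DOT_PAIRS) * REPETITIONS
dot_counts = [n for pair in DOT_PAIRS for n in pair]

print(ratios)
print(total_trials, min(dot_counts), max(dot_counts))
```

The computed values agree with the reported design: seven ratios, 70 experimental trials, and dot counts ranging from 5 to 15.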

If individuals did not correctly respond to at 
least four of the six practice trials, practice trials 
were repeated up to two times. If participants did 
not answer four of six correctly on any practice 
run, they did not proceed to the experimental trials. 
Ratios, stimulus presentation times, and order of 
presentation were modeled after Odic, Hock, and 
Halberda (2014). To control for the possibility that 
participants might utilize a strategy based on visual 
cues rather than number of dots, the following 
visual properties of dot sets were varied using a 
modified version of the MATLAB code recom- 
mended by Gebuis and Reynvoet (2011): convex 
hull (area extended by a stimulus), total surface 
area (aggregate value of dot surfaces), average dot 
diameter, total circumference, and density (convex 
hull divided by total surface area). In
approximately one quarter of the trials (22 of 70),
all four visual properties were congruent with
greater numerosity (i.e., the greater number of dots
had a greater convex hull, surface area, etc.). In
another quarter of the trials (18 of 70), all four
visual properties were incongruent with greater
numerosity. In the remaining trials, visual properties
were mixed congruent and incongruent.

Figure 1. Nonsymbolic numerical magnitude comparison stimuli
and paradigm timing. (A) Incongruent trial example of ratio 0.67
(smaller number dot set/larger number dot set, 6/9 = 0.67). (B)
Congruent trial example, also of ratio 0.67.
[Figure not reproduced. Panel labels: incongruent and congruent examples, each at ratio = 0.67; 1,200 ms stimulus followed by 1,800 ms fixation, arranged along a time axis.]

Analyses of task effects include all trials. Analy- 
ses directly addressing the research questions 
include trials that were either fully congruent (22 
trials) or incongruent (18 trials) on all five visual 
parameters. Mixed congruency trials were excluded. 
Congruent versus incongruent trials per ratio are 
not perfectly balanced in trial numbers, but the 
average ratio for each is nearly identical (average 
ratio congruent = 0.733, average ratio for incongruent
= 0.744; for further details, see Table S2). Performance
was calculated as mean number of items
correct and as a Weber fraction (Halberda et al., 
2008) to facilitate comparison with previously pub- 
lished research. However, the model implementing 
Levenberg—Marquardt least squares fit used to cal- 
culate Weber fractions did not provide a sufficient 
fit with the fewer number of trials available within 
congruency conditions (as indicated by whether the 
model predicted a significant amount of variance, 
p < .05). Furthermore, a growing body of literature 
suggests that mean accuracy is strongly correlated 
with and possibly more reliable than ratio-depen- 
dent metrics such as the Weber fraction (Gilmore, 
Attridge, & Inglis, 2011; Inglis & Gilmore, 2014), 
which is true even in the case of congruency com- 
parisons (Sztics, Devine, et al., 2013; Sztics, Nobes, 
et al., 2013). Therefore, in this study, mean accuracy 
percentages were used instead of Weber fractions 
to index performance on each of our number com- 
parison tasks. 
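
To make the two performance indices concrete, the sketch below computes mean accuracy and estimates a Weber fraction. It assumes the commonly used Gaussian model of the approximate number system, in which expected accuracy depends on the two numerosities and w. It also swaps the Levenberg-Marquardt fit named in the text for a plain grid search so that it runs with the standard library alone. All function names are hypothetical.

```python
import math


def predicted_accuracy(n1, n2, w):
    """Expected proportion correct for comparing n1 with n2 given Weber fraction w.

    Assumed model: 1 - 0.5 * erfc(|n1 - n2| / (sqrt(2) * w * sqrt(n1^2 + n2^2))).
    """
    diff = abs(n1 - n2)
    spread = math.sqrt(2.0) * w * math.sqrt(n1 ** 2 + n2 ** 2)
    return 1.0 - 0.5 * math.erfc(diff / spread)


def fit_weber_fraction(trials, w_grid=None):
    """Least squares grid search over w.

    trials: list of (n1, n2, observed_proportion_correct).
    Returns (best_w, sum_of_squared_errors).
    """
    if w_grid is None:
        w_grid = [i / 1000 for i in range(10, 2001)]  # 0.010 to 2.000
    best_w, best_sse = None, float("inf")
    for w in w_grid:
        sse = sum((acc - predicted_accuracy(n1, n2, w)) ** 2
                  for n1, n2, acc in trials)
        if sse < best_sse:
            best_w, best_sse = w, sse
    return best_w, best_sse


def mean_accuracy(responses):
    """responses: iterable of booleans (True = correct). Returns percent correct."""
    responses = list(responses)
    return 100.0 * sum(responses) / len(responses)
```

With few trials per ratio, the observed proportions are coarse and the fitted w becomes unstable, which is consistent with the motivation given above for preferring mean accuracy within congruency conditions.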


Working Memory 


The backward Corsi block-tapping test (Corsi, 
1972) provided a measure of visuospatial working 
memory. In this computerized task, children first 
viewed squares that lit up in a sequence on the 
screen, and then the students were asked to tap the 
squares in the reverse order in which they lit up. 
The task consisted of 16 total possible trials, includ- 
ing two practice trials. The student was given two 
attempts to correctly repeat the reverse sequence 
per sequence length, increasing in span from 2 to 8 
across the task. If the student answered at 
least 1 of the 2 attempts correctly, the student then 
proceeded to the longer (more difficult) 
sequence. The score of interest was the highest span 
with a correctly repeated sequence. For the 22 children 
(of n = 448) without sixth-grade Corsi spans, 
fifth-grade spans were used. For 
details, see Appendix B. 
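
The scoring rule for the backward Corsi task can be expressed as a short function. This is a sketch under stated assumptions: the name `corsi_max_span` and the input layout are hypothetical, practice trials are assumed to be excluded beforehand, and a child with no correct sequence is assigned a span of 0.

```python
def corsi_max_span(attempts_by_span):
    """Return the highest span with at least one correctly reversed sequence.

    attempts_by_span: dict mapping span length -> list of two booleans
    (True = the sequence was tapped correctly in reverse order).
    """
    max_span = 0
    for span in sorted(attempts_by_span):
        if any(attempts_by_span[span]):
            max_span = span  # at least 1 of 2 correct: advance to next length
        else:
            break  # both attempts failed: the task ends here
    return max_span
```

For example, a child who passes spans 2 and 3 but fails both attempts at span 4 receives a score of 3, regardless of any later entries in the record.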


Inhibitory Control and Task Switching 


The hearts and flowers task (Wright & Diamond, 
2014) was used as a measure of students’ task switching 
and inhibitory control. In this task, the child 
was first presented with a heart on either side of 
the screen and instructed to press the button that 
corresponds to the side of the screen with the heart. 
This first block comprised 12 trials. In the second 
block of trials (also 12 trials), the child was pre- 
sented with flowers and asked to press the button 
that is opposite the side of the flower. In the third 
set of trials, the child was randomly presented with 
either a heart or a flower and asked to follow the 
rule that corresponds to hearts and flowers, respec- 
tively. The third block comprised 48 trials. To index 
executive function, we used mean accuracy from 
the third, mixed-condition block of trials, and as 
such, this single measure captures both task switch- 
ing and inhibitory control (Diamond, 2014). One 
child was not assessed at sixth grade for Hearts 
and Flowers, but a score from seventh grade was 
available. The same z-score method described above 
was utilized to create a score for this child and z- 
scores were utilized for all subsequent analyses. 


Academic Achievement 


Reading Achievement: Woodcock Johnson-III Letter-Word Identification 


The Woodcock Johnson-III (WCJ-III; Woodcock, 
McGrew, & Mather, 2001) is a standard assessment 
of a range of skills, designed to be used with peo- 
ple ages 2-90+. The letter-word identification 
(LWID) subtest assesses children’s letter and sight 
word identification ability with the correct pronun- 
ciation. Items include identifying and pronouncing 
letters and words presented to the child (e.g., “A” 
or “dog”). Age-normed standard scores were calcu- 
lated as an early measure of reading achievement 
measured at the end of kindergarten and then con- 
verted to percentile ranks. 


Mathematics Achievement 


WCJ-III Quantitative Concepts and Applied 
Problems subtests were used as measures of 


mathematics achievement during the early school 
years (Pre-K-first grade) and KeyMath-3 (KM-3) 
subscales of Numeration, Algebra, and Geometry 
were used for the middle school time points (fifth-seventh 
grade). Standard scores from each measure 
were converted to percentile rank scores based on 
the nationally normed mean and standard devia- 
tions of the sample utilized for each respective stan- 
dardized assessment. Percentile rank scores were 
utilized for (a) achievement group creations based 
on percentile rank threshold in the first analysis 
and (b) the principal outcome variable of interest in 
our multilevel regression analysis. 
WCJ-III: Quantitative Concepts and Applied Problems. 
Quantitative Concepts and Applied Problems 
subtests were administered at the end of each 
school year during Pre-K, kindergarten, and first 
grade. Individually administered, Quantitative Con- 
cepts has two parts and assesses students’ knowl- 
edge of mathematical concepts, symbols, and 
vocabulary, including numbers, shapes, and 
sequences; it measures aspects of quantitative math- 
ematics knowledge and recognition of patterns in a 
series of numbers. The Applied Problems subtest is 
an untimed verbal and picture-based measure of a 
student’s ability to analyze and solve mathematics 
problems, beginning with the application of basic 
number concepts. At each early time point, age- 
normed standard scores were calculated for each 
subtest and averaged together to create a composite 
measure of mathematics competence representing a 
broad range of mathematics skills. These scores 
were subsequently converted to percentile ranks. 
KeyMath-3. The KM-3 Diagnostic Assessment 
(Connolly, 2007) is a comprehensive, norm-refer- 
enced measure of essential mathematical concepts 
and skills. It was administered at the end of each 
school year during fifth, sixth, and seventh grades. 
We used three subscales out of the five subscales in 
the Basic Concepts area. (a) Numeration: The 
Numeration subtest measures an individual’s 
understanding of whole and rational numbers. It 
covers topics such as identifying, representing, comparing, 
and rounding one-, two-, and three-digit 
numbers as well as fractions, decimal values, and 
percentages. It also covers advanced numeration 
concepts such as exponents, scientific notation, and 
square roots. (b) Algebra: The Algebra subtest mea- 
sures an individual’s understanding of prealgebraic 
and algebraic concepts. It covers topics such as sort- 
ing, classifying, and ordering by a variety of attri- 
butes; recognizing and describing patterns and 
functions; working with number sentences, opera- 
tional properties, variables, expressions, equations, 


proportions, and functions; and representing mathe- 
matical relations. (c) Geometry: The Geometry sub- 
test measures an individual’s ability to analyze, 
describe, compare, and classify two- and three-dimensional 
shapes. It also covers topics such as 
spatial relations and reasoning, coordinates, sym- 
metry, and geometric modeling. Scale scores in the 
KM-3 are age normed to reflect population means 
of 10 (SD = 3) for each subtest. We averaged scale 
scores from the three subscales into a composite 
measure (KM composite) as in previous analyses 
involving the current sample (Price & Wilkey, 2017; 
Rittle-Johnson, Fyfe, Hofer, & Farran, 2017). This 
score was then converted to a percentile rank to 
compose mathematics achievement groups across 
measures of mathematics achievement in the early 
grades (Pre-K-first grade) and late measures of 
mathematics achievement (fifth grade to seventh 
grade). 
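
The conversion from scale scores to a percentile rank can be illustrated as follows. This is only a sketch: it assumes normally distributed norms and applies the single-subtest norms stated above (mean 10, SD 3) directly to the averaged composite, which is a simplification. In practice the publisher's norm tables would govern the conversion. The function names are hypothetical.

```python
from statistics import NormalDist, mean


def percentile_rank(score, norm_mean, norm_sd):
    """Percentile rank of a score under an assumed normal norming distribution."""
    return 100.0 * NormalDist(mu=norm_mean, sigma=norm_sd).cdf(score)


def km_composite_percentile(numeration, algebra, geometry,
                            norm_mean=10.0, norm_sd=3.0):
    """Average three KM-3 scale scores and convert the composite to a percentile.

    Assumption: the subtest norms (mean 10, SD 3) are reused for the composite.
    """
    composite = mean([numeration, algebra, geometry])
    return percentile_rank(composite, norm_mean, norm_sd)
```

Under these assumptions, a composite one standard deviation below the norm mean (a scale score of 7) corresponds to roughly the 16th percentile.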

The relation between KM-3 scores and predictor 
variables was nonlinear based on visual inspection 
of scatter plots, so when conducting analyses that 
assumed a linear relation (e.g., bivariate correlation, 
partial correlation, or regression), models were fit 
using a transformed outcome (i.e., cube root) of 
KM-3 percentile rank. A detailed exploration of the 
untransformed achievement scores’ relation to predictor 
variables is provided in Appendix C. 


Analysis 


To investigate group differences among DD, LA, 
and TA groups on nonsymbolic comparison on 
both congruent and incongruent trials, we con- 
ducted a two-way (3 x 2), mixed effects analysis of 
variance (ANOVA) with achievement group as a 
between-subject factor, congruency condition of 
nonsymbolic comparison as a within-subjects factor, 
and accuracy rate on the nonsymbolic comparison 
task at sixth grade as the dependent variable. 
Levene’s tests were run for each ANOVA to assess 
violations of homogeneity of variance, which 
often result from unequal sample sizes. When violated, 
Welch’s adjusted F was used for the ANOVA 
and noted in the results. One-way post hoc t-tests 
were conducted to examine simple main effects and 
pairwise differences where appropriate. Bonferroni- 
corrected p-values are reported to correct for multi- 
ple comparisons for all subsequent analyses and to 
ensure tests were robust against violations of homogeneity 
of variances between groups. Effect sizes 
are reported as Hedge’s g, which accounts for 
unequal group ns by weighting the pooled standard 
deviation by group sample size. 

Because clustering of students within schools did 
not account for a significant proportion of variation 
in sixth-grade nonsymbolic number comparison 
accuracy (ρ = .009, p = .74), a multilevel modeling 
approach to account for the clustering of students 
within schools was not needed. 
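
The share of variance attributable to school membership is an intraclass correlation. The sketch below uses the one-way ANOVA estimator for unbalanced groups, which is swapped in here for simplicity. The source does not state how its value was estimated, and an unconditional multilevel model would be the more usual route. The function name is hypothetical.

```python
def intraclass_correlation(groups):
    """One-way ANOVA estimate of the intraclass correlation.

    groups: list of lists of scores, one inner list per school.
    Uses the adjusted average group size n0 for unbalanced designs.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(n * (m - grand_mean) ** 2
                     for n, m in zip(sizes, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)
    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)
```

A value near zero indicates that students within a school are no more alike than students from different schools, in which case single-level analyses are defensible.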

The second set of analyses used random effects 
multilevel models to predict sixth-grade mathemat- 
ics achievement from concurrent experimental mea- 
sures. This analysis examined whether individual 
differences in nonsymbolic number comparison per- 
formance related to standardized mathematics 
achievement across a wide range of achievement. 
Specifically, we examined whether sixth graders’ 
accuracy on nonsymbolic number comparison for 
incongruent and congruent trials predicted concur- 
rent mathematics achievement for the full sample of 
students (n = 448), and whether the relation chan- 
ged when controlling for early reading achievement 
and domain-general executive functioning. 


Results 
Task Effects 


Nonsymbolic comparison task performance profiles 
were consistent with previous findings (e.g., 
Lyons, Nuerk, & Ansari, 2015), showing a significant 
effect of ratio on mean accuracy for all trials 
[F(6, 447) = 1,255.22, p < .001, partial η² = .737] and 
within congruency conditions [F(6, 447) = 339.01, 
p < .001, partial η² = .431 for congruent trials; 
F(6, 447) = 401.17, p < .001, partial η² = .473 for incongruent 
trials]. Furthermore, both mean accuracy 
and Weber fraction were correlated with mathematics 
achievement at sixth grade (mean accuracy Pearson 
r(446) = .191, p < .001, 95% CI [.100, .278]; 
Weber fraction Pearson r(446) = −.244, p < .001, 
95% CI [−.329, −.155]), which is in line with a 
recent meta-analysis reporting an average correlation 
of r = .241 (k = 195) between nonsymbolic 
comparison and a broad range of mathematics 
achievement measures across multiple age groups 
(Schneider et al., 2017). Mean accuracy and Weber 
fractions were highly correlated (Pearson 
r(446) = −.919, p < .001, 95% CI [−.932, −.903]). 
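
Confidence intervals for Pearson correlations such as those above are typically obtained with the Fisher z transformation. The sketch below assumes that method, since the source does not name the procedure it used. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def pearson_r_confidence_interval(r, n, level=0.95):
    """Confidence interval for a Pearson r via the Fisher z transformation."""
    z = math.atanh(r)                    # Fisher z of the observed correlation
    se = 1.0 / math.sqrt(n - 3)          # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)
```

Called with r = .191 and n = 448, the function should land close to the interval reported above. Small discrepancies in the third decimal are expected because the published r is itself rounded.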


Achievement Group Comparison Results 


Results of the two-way ANOVA indicated a 
main effect of achievement group [F(2, 219) = 6.694, 
p = .002, partial η² = .058], a main effect of congruency 
[F(1, 219) = 27.570, p < .001, partial η² = .112] 
whereby individuals were more accurate on 
congruent trials, and an interaction [F(2, 
219) = 4.816, p = .009, partial η² = .042]. To characterize 
the main effect of achievement group, we 
conducted between-subjects t-tests comparing accuracy 
on the combined congruent and incongruent 
trials. Accuracy rate was 6.7 points [95% CI: 2.6, 
10.9] lower for the DD group than the LA group 
[t(32) = −3.293, Bonferroni adjusted (α/3) p = .003, 
unadjusted p < .001, Hedge’s g = 1.182] and 4.3 
points [95% CI: 2.0, 6.5] lower for the DD group 
than the TA group [t(208) = −3.761, Bonferroni 
adjusted (α/3) p = .002, unadjusted p < .001, 
Hedge’s g = 0.847]. There was no significant difference 
between the LA and TA groups 
[t(198) = 1.619, Bonferroni adjusted (α/3) p = .161, 
unadjusted p = .053, Hedge’s g = 0.482]. 
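
The effect size and multiple-comparison correction used in these pairwise tests can be sketched as below. The code assumes the standard small-sample correction factor for Hedges' g and the simple multiply-by-m form of the Bonferroni adjustment. Function names are hypothetical, and no values from the study are reproduced because group standard deviations are not given in this section.

```python
import math


def hedges_g(mean1, sd1, n1, mean2, sd2, n2):
    """Bias-corrected standardized mean difference for two independent groups."""
    # Pooled variance weights each group's variance by its degrees of freedom,
    # which is how unequal group sizes are taken into account.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    d = (mean1 - mean2) / math.sqrt(pooled_var)
    correction = 1.0 - 3.0 / (4.0 * (n1 + n2) - 9.0)  # small-sample correction
    return d * correction


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

With three pairwise comparisons, multiplying each p-value by 3 is equivalent to testing the unadjusted p-values against α/3.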


The Effect of Congruency 


Pairwise comparisons were conducted to characterize 
the simple effect of congruency within 
achievement groups. There was an effect of congruency 
in the DD and TA groups whereby, on average, 
the DD group accuracy rate was 25.2 points 
[95% CI: 16.6, 33.8] lower for incongruent compared 
to congruent trials [t(21) = 6.076, Bonferroni 
adjusted (α/3) p < .001, unadjusted p < .001, 
Hedge’s g = 2.203] and the accuracy rate for the TA 
group was 10.7 points [95% CI: 7.6, 13.8] lower for 
incongruent compared to congruent trials 
[t(21) = 6.795, Bonferroni adjusted (α/3) p < .001, 
unadjusted p < .001, Hedge’s g = 0.844]. However, 
there was no effect of congruency in the LA group 
[t(11) = 0.716, Bonferroni adjusted (α/3) p = .732, 
unadjusted p = .244, Hedge’s g = 0.359] (see Figure 2 
and Table 1 for means). 


The Effect of Achievement Group 


To characterize the simple effects of achievement 
group, one-way analyses of variance (ANOVAs) 
were conducted within congruency conditions, fol- 
lowed by pairwise comparisons of achievement 
groups. Results from the ANOVA on accuracy for 
congruent trials showed no effect of achievement 
group [F(2, 219) = 0.476, p = .622, η² = .004] (Figure 2). 
Levene’s test of equality of variances 
showed no significant differences in variance across 
groups for mean accuracy of congruent trials 
(Levene’s statistic = 0.383, p = .682). 

In contrast, results from the ANOVA on incongruent 
trials showed a significant effect of achievement 
group on accuracy [Welch’s F(2, 
21.45) = 8.345, p = .002, η² = .070]. Levene’s test 
Figure 2. Nonsymbolic number comparison accuracy rates by 
achievement group. DD = developmental dyscalculia; LA = low 
achieving; TA = typically achieving. Error bars represent stan- 
dard errors. p-Values are indicated for differences in accuracy 
between congruent and incongruent trials (***p < .001). 


indicated significant differences in variance across 
groups for mean accuracy of incongruent trials 
(Levene’s statistic = 4.317, p = .014); however, vari- 
ance only differed between groups by a factor of 
2.56 at most, so Welch’s adjusted F was used for 
the ANOVA. After adjusting for multiple comparisons, 
post hoc tests of incongruent trials indicated 
that accuracy rate for the DD group was 17.3 points 
[95% CI: 5.2, 29.4] lower than the LA group 
[t(32) = −2.916, Bonferroni adjusted (α/3) p = .002, 
unadjusted p = .005, Hedge’s g = 1.046] and DD 
accuracy rate was 12.1 points [95% CI: 5.9, 18.3] 
lower than the TA group [t(208) = −3.862, Bonferroni 
adjusted (α/3) p < .001, unadjusted p < .001, 
Hedge’s g = 0.870]. There was no difference 
between LA and TA groups [t(198) = 1.197, Bonferroni 
adjusted (α/3) p = .350, unadjusted p = .117, 
Hedge’s g = 0.356, mean difference = 5.2 points, 
95% CI: −3.3, 13.7]. 

To further investigate achievement group differ- 
ences after controlling for domain-general factors, 
analyses were repeated as a one-way analysis of 
covariance (ANCOVA) with the covariates of max 
span achieved on the backward Corsi, mean accu- 
racy during mixed trials of the hearts and flowers 
task, age at time of testing, and percentile rank on 
the WCJ-III LWID at the end of kindergarten. After 
controlling for these factors, there was still a significant 
effect of achievement group for accuracy on 
incongruent trials [F(2, 215) = 4.658, p = .010, partial 
η² = .042]. After adjusting for multiple comparisons, 
covariate-adjusted means were 16.6 points 
[95% CI: 3.3, 29.9] lower for the DD than the LA 
group [Bonferroni adjusted (α/3) p = .015, unadjusted 
p = .005, Hedge’s g = 0.823] and 10.0 points 
[95% CI: 1.0, 21.0] lower for the DD group than the 
TA group [Bonferroni adjusted (α/3) p = .045, 
unadjusted p = .002, Hedge’s g = 0.823]. There was 
no significant difference between the LA and TA 
groups [Bonferroni adjusted (α/3) p = .231, unadjusted 
p = .077, Hedge’s g = 0.585]. These results 
replicate the pattern observed in the ANOVA. 

In sum, all ANOVAs and ANCOVAs conducted 
show the same pattern of results whereby: (a) no 
group differences are observed for congruent trials 
of the comparison task, (b) the DD group performs 
significantly below LA and TA groups on incongru- 
ent trials even when controlling for other cognitive 
factors and early reading achievement, and (c) no 
group differences are present between LA and TA 
groups on incongruent trials. 


Full Range of Achievement Results 


For descriptive statistics of the full sample, see 
Table 2. For bivariate correlations among measures, 
see Table S3. Of note is a moderate, negative bivariate 
correlation between accuracy rates for congruent 
and incongruent trials, r(446) = −.447, p < .001, 95% 
CI [−.518, −.369]; see Figure S1 for scatter plot. To 
investigate potential differences among subtests of 
the KM-3 and their correlations with performance in 
the nonsymbolic number comparison task, Pearson-r 
values were converted to z values and then com- 
pared with a two-tailed z-test. Results indicated 
there were no significant differences among any cor- 
relations according to KM-3 subtests (all ps > .435, 
see Table S4 for details) and all further analyses were 
conducted on KM-3 composite scores. 
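
The comparison of correlations via z values can be sketched as follows. This version treats the two correlations as independent, which is an assumption. Correlations computed on the same sample, as here, are often compared with a test for dependent correlations instead, and the source does not say which variant was applied. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def compare_correlations(r1, n1, r2, n2):
    """Two-tailed z-test for the difference between two Pearson correlations.

    Assumption: the correlations come from independent samples.
    Returns (z, p).
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)   # Fisher r-to-z conversion
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p
```

Applying the function to each pair of subtest correlations yields one p-value per pair, and a pattern of uniformly large p-values would support pooling the subtests into a composite.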


Multilevel Regression Model Predicting Mathematics 
Achievement 


Multilevel modeling accounts for the clustering 
of students within schools, as approximately 23% of 
the variation in sixth-grade mathematics achievement 
was due to school membership (ρ = .225, 
p < .0001). Equation 1 illustrates the modeling 
approach, in which MATH_ij represents sixth-grade 
mathematics achievement for each student i in 
school j. The predictors INCON_ij and CON_ij represent 
student-level accuracy on nonsymbolic number 
comparison for incongruent and congruent trials, 
respectively; HAF_ij represents student-level standardized 
scores on the hearts and flowers task; 
CORSI_ij represents student-level standardized backward 
Corsi max span scores; READ_ij represents student-level 
age-normed standard scores on the 
LWID test; and X_ij represents a vector of potential 
Table 2 
Descriptive Statistics for Experimental and Standardized Measures for 
Full Sample (n = 448, 250 female) 

Measure                                                     M       SD      Range 
Age (years), sixth grade                                    12.0    0.32    11.4-13.4 
Nonsymbolic comparison (accuracy, %)                        74.8    5.48    48.6-91.4 
Nonsymbolic comparison (congruent accuracy, %)              76.6    11.2    36.4-100 
Nonsymbolic comparison (incongruent accuracy, %)            63.1    14.5    22.2-94.4 
Nonsymbolic comparison (Weber fraction, w)                  0.29    0.10    0.10-1.42 
Backward Corsi (max span)^a                                 4.81    1.22    2-8 
Hearts and flowers (accuracy, %)^a                          73.4    14.5    35-100 
Letter-word identification, WCJ-III (K, percentile rank)    109.7   12.7    73-144 
Math achievement, KM-3 (sixth grade, percentile rank)       27.0    23.1    0.5-92.5 

Note. WCJ-III = Woodcock Johnson-III; KM-3 = KeyMath-3. 
^a Raw scores reported here for year available. See Sections Working 
Memory, Inhibitory Control and Task Switching, and Appendix C 
for a detailed description of scores used for analyses. 


student-level covariates, such as gender or age at 
testing. Due to nonlinearity in the relation between 
mathematics scores and the predictors, models were 
fit using a transformed outcome (i.e., cube root). 


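$$\sqrt[3]{\mathrm{MATH}_{ij}} = \beta_0 + \beta_1 \mathrm{INCON}_{ij} + \beta_2 \mathrm{CON}_{ij} + \beta_3 \mathrm{HAF}_{ij} + \beta_4 \mathrm{CORSI}_{ij} + \beta_5 \mathrm{READ}_{ij} + \beta_6 X_{ij} + (\epsilon_{ij} + u_j) \tag{1}$$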


The bivariate correlations of the transformed 
achievement variable are presented in Figure 3 with 
a plot of nonsymbolic number comparison perfor- 
mance by congruency on achievement. 

Table 3 presents parameter estimates, standard 
errors, significance levels, random effects, and 
goodness-of-fit statistics for a taxonomy of fitted 
models describing the relation between mathemat- 
ics achievement and nonsymbolic number compar- 
ison, domain-general executive functioning, early 
reading achievement, and age at testing in sixth 
grade. The first model (i.e., M1) displays the grand 
mean of sixth-grade mathematics achievement, 
across all students and schools, and the intraclass 
correlation (ρ = .225, p < .0001) that motivates the 
Figure 3. Nonsymbolic number comparison accuracy rates split by (left) congruent and (right) incongruent trials including all individuals 
from the full sample plotted against the cube root of MATH_ij, the outcome variable of Equation 1, that is, the cube root of the composite math 
achievement percentile rank. DD = developmental dyscalculia; LA = low achieving; TA = typically achieving; pr = percentile rank. 
Bivariate correlations of the full sample are presented in the bottom corner of each panel (***p < .001). Orange diamonds represent 
individuals who did not fit our selection criteria for stable achievement grouping based on pre-K to seventh-grade achievement. 


multilevel modeling approach. Model M2 shows 
the relations between accuracy on congruent and 
incongruent conditions of the nonsymbolic number 
comparison task and transformed sixth-grade math- 
ematics achievement. There is a statistically signifi- 
cant relation between accuracy on incongruent 
nonsymbolic number comparison and transformed 
sixth-grade mathematics achievement (z = 4.88, 
p < .0001), but accuracy on congruent trials is not a 
statistically significant predictor of mathematics 
achievement (z = 1.16, p = .25). Accordingly, accu- 
racy on congruent trials was excluded from subse- 
quent models. 

Subsequent models (M3-M5) show that the relation 
between accuracy on incongruent trials of the 
nonsymbolic number comparison task and transformed 
sixth-grade mathematics achievement persists 
after controlling for additional predictors of 
mathematics achievement. Model M3 shows the 
relation between accuracy on incongruent nonsym- 
bolic number comparison trials and transformed 
mathematics achievement, controlling for domain- 
general executive functioning. Hearts and Flowers 
and backward Corsi performance have a statisti- 
cally significant relation with mathematics achieve- 
ment (z = 7.71, p< .0001 and z=7.12, p < .0001, 
respectively), controlling for nonsymbolic number 
comparison. Parameter estimates and statistical sig- 
nificance of relations remain stable when control- 
ling for reading performance in kindergarten (see 
Table 3, M4) and age of mathematics testing in 
sixth grade (see Table 3, M5), though the 


magnitudes decrease slightly. Additional models 
were fit testing demographic variables (e.g., gender) 
and interaction terms among the nonsymbolic com- 
parison and executive function predictors; however, 
none were statistically significant (ps ranged from 
.06 to .98). Furthermore, we conducted a sensitivity 
analysis to examine whether students with DD may 
be driving the relationship between performance on 
incongruent trials and mathematics achievement. 
To do so, we refit model M5 without the DD sub- 
group (n = 22). Results were unchanged. Taken 
together, the analysis suggests that student perfor- 
mance on incongruent trials of nonsymbolic num- 
ber comparison is predictive of concurrent 
mathematics achievement, above and beyond non- 
numerical, domain-general executive functioning, 
early reading achievement, and age at testing in 
sixth grade. For detailed explanation of the model 
fit, see Appendix C. 


Discussion 


This study investigated the relation among ANS 
function, executive function, and mathematics 
achievement by examining performance on the non- 
symbolic number comparison task, separately for 
congruent and incongruent trials, while controlling 
for multiple components of executive function mea- 
sured in non-numerical contexts. We investigated 
this relation first as it relates to group differences 
among DD, LA, and TA students and then as a 
factor related to mathematics across a full range of 
achievement. Results indicated that a dynamic 
interplay of the ANS and executive function mecha- 
nisms, beyond either mechanism alone, represents a 
deficit specific to DD and is also related to mathe- 
matics across a full range of mathematics achieve- 
ment. Together, the current findings suggest that a 
focus on ANS alone is insufficient to explain the 
relation between basic number processing and 
mathematics outcomes. Therefore, we suggest that 
our results point to a need to reframe existing mod- 
els of the relation between number processing and 
mathematics competence to include the relation 
between executive function mechanisms and magni- 
tude processing, and to move beyond single mecha- 
nism explanations more generally. 


Achievement Group Comparison 


In the first analysis, we compared accuracy rates 
in the nonsymbolic comparison task across three 
mathematics achievement levels (i.e., DD, LA, and 
TA) defined through 6 years of consistent achieve- 
ment, including the first 3 years of school entry 
(pre-K-first grade) and 3 later years of entry to mid- 
dle school (fifth-seventh grade). Our results showed 
that accuracy on incongruent trials, and not congru- 
ent trials, was significantly lower for DD (defined 
at two different thresholds) compared to LA and 
TA groups, even after controlling for early reading 
achievement, visuospatial working memory, inhibitory control, 
and task shifting. LA and TA groups, on the 
other hand, did not differ from one another, thus 
supporting the hypothesis that an impairment in 
the interaction between executive function and the 
ANS is characteristic of individuals with DD. 

Explanations of the link between ANS and math- 
ematics achievement that involve a dynamic inter- 
action between the ANS and executive function 
have considerable support from a large body of 
research linking low mathematics performance with 
various executive function impairments. These 
include associations between low mathematics 
achievement and inhibitory control (Blair & Razza, 
2007; Espy et al., 2004; Szűcs, Devine, et al., 2013; 
Szűcs, Nobes, et al., 2013), spatial processing 
(Rourke & Conway, 1997), verbal and visuospatial 
working memory (Bull & Lee, 2014; Bull & Scerif, 
2001; Geary, 2004; Lee & Bull, 2016; Szűcs, Devine, 
et al., 2013; Szűcs, Nobes, et al., 2013), set shifting 
(Willcutt et al., 2013), sustained visual attention 
(Anobile, Stievano, & Burr, 2013), and inattentive 
behaviors (Fias et al., 2013; Shalev et al., 1995). Fur- 
thermore, DD has a high rate of comorbidity with 


attention-deficit/hyperactivity disorder (Czamara 
et al., 2013). Although the link is often made 
between general measures of executive function 
and mathematics achievement, there is evidence 
that the relation is specific to measures of executive 
function involving numerically relevant informa- 
tion. For example, Siegel and Ryan (1989) found 
that individuals with DD have impairments of 
working memory related to processing numerical 
information and not language. Experimental studies 
have also demonstrated a distinction between exec- 
utive function to numerical and non-numerical con- 
tent. Ashkenazi et al. (2009) found that individuals 
with DD had more difficulty recruiting attention to 
numerical information but not non-numerical infor- 
mation under heightened cognitive load compared 
to typically developing (TD) peers. This array of findings has led some to 
suggest that DD may involve a domain-specific 
executive function problem (e.g., Bull & Scerif, 
2001). In other words, individuals with DD may 
not have a generally impaired ANS, but 
rather have difficulty working with numerical mag- 
nitudes under additional executive function 
demands. Results from this study showing mathe- 
matics achievement group differences in nonsym- 
bolic comparison performance only during 
incongruent trials, after controlling for non-numeri- 
cal executive function, lend further support to this 
hypothesis. Whether this deficit is driven by a fail- 
ure to upregulate numerical information above 
competing information as attention shifting would 
require, or perhaps a failure to disengage attention 
from non-numerical information by inhibiting inter- 
ference from irrelevant stimulus dimensions 
remains an open empirical question. As the DD 
group’s average performance during incongruent 
trials is around chance, little can be inferred 
about strategy during these trials. 

This study’s results contrast with some previous 
studies using an alternative method for controlling 
visual parameters of dot stimuli, which have not 
found an effect of congruency on response behav- 
iors (Odic, Libertus, Feigenson, & Halberda, 2013; 
Odic et al., 2014). However, in those studies, the 
effect of congruency may be confounded by the fact 
that degree of visual congruency (and incongru- 
ency) is linearly related to trial ratio. This means 
that in difficult ratio trials, which capture the most 
variance related to individual differences in ANS 
acuity, each dot set is very similar in terms of sur- 
face area, thus decreasing the likelihood of finding 
a congruency effect. Although this method may be 
appropriate for measurement of general ANS acu- 
ity, the effects of congruency are difficult to 


separate from the effects of numerical ratio, because 
the two are linked so tightly. This study uses a 
method of controlling congruency that is more bal- 
anced across ratios and controls for a greater num- 
ber of stimulus properties beyond dot size and 
surface area (for a detailed discussion, see Clayton 
et al., 2015). Therefore, the effects of congruency 
and ANS function are more clearly disentangled in 
this study. 

One unexpected result from the first, group-wise 
analysis is that DD and TA groups showed congru- 
ency effects, as expected, but LA children did not. 
Despite this lack of a congruency effect in the cur- 
rent findings for this achievement group, we cau- 
tion against any strong interpretation of this result. 
There is a trend in the expected direction for each 
of the LA children groupings (10th and 
6.7th percentile cutoffs), in which children are 
more accurate on congruent trials than incongruent 
trials. Despite the lack of a significant effect, the 
effect sizes are relatively large (Hedge’s g = 0.359 
and Hedge’s g = 0.71), and mean differences are 6 
accuracy points and 10 accuracy points for each 
sample, respectively. It is likely that the absence of 
a statistically significant congruency effect for LA 
children is due to high variance in accuracy on 
incongruent trials and a lack of power for this com- 
parison. 


Full Range of Achievement 


In the second analysis, we examined whether 
sixth graders’ accuracy on nonsymbolic number 
comparison for incongruent and congruent trials 
predicted concurrent mathematics achievement for 
the full sample of students and whether the relation 
changed when controlling for early reading achieve- 
ment and non-numerical, domain-general executive 
functioning. The sample for this analysis included a 
wide range of mathematics achievement levels that 
included all participants from the first analysis and 
participants in the broader study that did not con- 
sistently achieve in the same level year to year. 
Similar to the logic of the first analysis, if number- 
specific executive function is related to individual 
differences in mathematics achievement across a 
wide range of achievement, performance on incon- 
gruent trials should predict mathematics achieve- 
ment beyond what can be accounted for by 
congruent trials and early reading achievement, 
visuospatial working memory, inhibitory control, 
and task shifting. Indeed, results showed that accu- 
racy on incongruent trials predicted concurrent 
mathematics achievement even after controlling for 
early reading achievement, visuospatial working
memory, inhibitory control, and task shifting, thus 
supporting the hypothesis that number-specific 
executive function relates to individual differences 
in mathematics achievement across a wide range of 
achievement levels. Furthermore, the relation 
remained unchanged when we excluded individu- 
als with DD from the regression. These findings 
build on previous research that has shown other 
number-specific measures of executive function 
relate to mathematics achievement in TD and high 
achieving groups. For example, Dark and Benbow
(1994) found that working memory tasks with
numerical stimuli were more closely related to
mathematical precocity than tasks with non-numerical
stimuli across a range of tasks in adults. Similarly,
studies of children have demonstrated that
inhibitory control and working memory for numerical
information account for significant variance in
individual differences in mathematics ability and 
early numeracy beyond similar non-numerical mea- 
sures of executive function (Bull & Scerif, 2001; 
Merkley, Thompson, & Scerif, 2016). 

Interestingly, bivariate correlations indicated that 
children with high accuracy on incongruent trials 
tended to have low accuracy on congruent trials 
(and vice versa), even though congruent trials were 
not related to mathematics achievement. This may
be important for three reasons. First, if only
incongruent trials are related to mathematics
achievement, researchers may be tempted to design
measures consisting exclusively of incongruent
trials. However, this inverse relation may indicate that
incongruent trials are inherently related to congru- 
ent trials such that removing congruent trials would 
change the nature of the task demands for incon- 
gruent trials. Second, speculation about inhibitory 
control has dominated the conversation about the 
cognitive mechanisms underlying the difference 
between incongruent trials and congruent trials of 
the nonsymbolic comparison task (Cragg et al., 
2017; Gilmore et al., 2015). Although inhibitory con- 
trol may be a factor, the inverse correlation between 
congruency conditions may indicate that some indi- 
viduals are unable to switch between strategies that 
capitalize on visual cues during congruent trials 
and ignore these cues otherwise. In addition to 
working memory and inhibitory control, task shift- 
ing may contribute to differences in performance 
between incongruent and congruent trials. Third, 
this inverse correlation is somewhat consistent with 
a developmental account recently suggested by 
Piazza, De Feo, Panzeri, and Dehaene (2018), 
whereby development and education both correlate 
with an increased ability to filter out irrelevant cues
in incongruent number comparison trials, similar to 
those in this study. In contrast, performance on 
congruent trials dropped or remained the same 
with increased education and age, suggesting there 
was not a generalized increase in acuity of number 
perception. Piazza et al.’s developmental findings
suggest that better performers on incongruent trials
may not benefit as much from congruent visual
cues, which may explain this inverse correlation. 

Given recent work demonstrating the contribu- 
tion of numerosity discrimination to math achieve- 
ment compared to non-numerical bias (Starr et al., 
2017), we did expect to see a relation, albeit weaker, 
between mathematics achievement and accuracy 
rate on congruent trials. However, in this study, 
accuracy on congruent trials was unrelated to math- 
ematics achievement, either as a factor distinguish- 
ing between achievement groups or as a predictor 
of mathematics achievement. This was true even 
before controlling for other academic or cognitive 
factors. Furthermore, the magnitude of this relation 
in both analyses was close to zero, showing no 
trend in the expected direction. This calls into ques- 
tion whether ANS function alone, not measured 
under high executive function demands, is an 
important factor related to DD and mathematics 
achievement more generally. Studies showing no 
relation between nonsymbolic number comparison 
performance and math achievement after control- 
ling for executive function have argued this point. 
For example, in a large sample of TD children, 
Szűcs, Devine, Soltész, Nobes, and Gabriel (2014)
found that after controlling for other executive 
function measures such as dot matrices, visuospa- 
tial working memory, and the trail-making task, 
nonsymbolic comparison did not significantly relate 
to mathematics achievement. Interestingly, in that 
study, sustained visual attention was the best corre- 
late of ANS acuity, which may further indicate that 
attention mechanisms and ANS mechanisms are 
integrally related. 

Previous neuroimaging research has shown that 
congruent and incongruent trials of the nonsym- 
bolic number comparison task recruit different neu- 
ral mechanisms, with incongruent trials recruiting 
large portions of the frontoparietal attention net- 
work (Leibovich, Vogel, Henik, & Ansari, 2016; 
Wilkey, Barone, Mazzocco, Vogel, & Price, 2017). 
Recruitment of additional neurocognitive mecha- 
nisms during incongruent trials may be an integral 
component of the previously assumed direct rela- 
tion between ANS and mathematics achievement in 
studies of mathematics learning disability, but also
across the full range of achievement. Supporting
this interpretation, recent neuroimaging evidence 
from Wilkey and Price (in press) shows that indi- 
vidual differences in neural activity of inferior fron- 
tal brain regions, indexing the numerical 
congruency effect in the nonsymbolic comparison 
task, predicted mathematics achievement in a TD 
sample of third- and fourth-grade children. This 
relation held even after controlling for neural activ- 
ity in a Flanker task and domain-general cognitive 
factors. In contrast, individual differences in the 
ratio effect (a neural metric of numerical acuity) did 
not relate to mathematics, including activity in 
expected posterior parietal regions. This finding 
underscores the importance of the neurocognitive 
mechanisms that interact with magnitude process- 
ing mechanisms for mathematics competence, and 
again speaks to the need to move beyond a single 
mechanism explanation of foundational competen- 
cies for mathematics development. 


Limitations and Future Directions 


Several factors should be taken into account 
when interpreting the results of this study. First, 
participants were recruited from an urban public 
school system and were mostly from low-income 
households. Low household income often impedes 
access to high-quality early mathematics experiences 
(Ramani & Siegler, 2008), so factors driving the rela- 
tion between nonsymbolic comparison and mathe- 
matics achievement may differ across students with 
differing household incomes. Furthermore, the
relation between nonsymbolic comparison and
mathematics achievement has reportedly been lower
in low-income samples than in middle- and high-income
samples (Fuhs, Kelley, O’Rear, & Villano, 2016; Fuhs
& McNeil, 2013). However, effect sizes of the
relation between nonsymbolic comparison and
mathematics achievement from this study are in line with
previous meta-analyses (Chen & Li, 2014; Schneider
et al., 2017). Additionally, the lack of relation
between mathematics achievement and congruent 
trials, and significant relation between mathematics 
achievement and incongruent trials, has been previ- 
ously reported in low-income (Fuhs & McNeil, 
2013) and middle-to-high income individuals (Keller 
& Libertus, 2015). Furthermore, Price and Wilkey 
(2017) showed that the mediating relation among 
nonsymbolic comparison accuracy rates and mathe- 
matics achievement in the same group of children 
as this study follows the same patterns as previ- 
ously reported literature from wider socioeconomic 
status samples (Lyons & Beilock, 2011). 


Second, alternative explanations of the current 
results are possible. For example, rather than our 
hypothesis about domain-specific executive func- 
tion, the current results could indicate that individ- 
uals who utilize an appropriate strategy for 
incongruent trials, whether consciously or not, are 
better at mathematics. If framed as a task strategy, 
then strategy selection does not necessarily equate 
to number-specific executive function. Another 
alternative is that individual differences in task per- 
formance are based not on cognitive efficiencies but 
rather a predisposition to focus on one aspect of the 
visual stimuli. A deficit of number-specific
executive function is different from the failure to utilize
it. Prior research has documented that individuals
with a tendency to spontaneously focus on exact 
quantities have higher arithmetic abilities (Batche- 
lor, Inglis, & Gilmore, 2015; Hannula, Lepola, & 
Lehtinen, 2010). Recently, this line of research has 
been expanded to incorporate spontaneous orienta- 
tion to conflicting or irrelevant dimensions of non- 
numerical magnitude similar to those of this study 
(Viarouge et al., 2017). Research on the underlying 
neurocognitive mechanisms can also help to distill 
the root of the differences observed in the current 
results. 

A third factor to consider, specifically in regard 
to the group comparison results, is that choosing to 
identify a DD group based on consistent, low math- 
ematics achievement over time has both benefits 
and limitations for interpreting results. In our
methods, we argue that DD is likely
heterogeneous in nature and that identifying a “pure
dyscalculic” group via an IQ-math
achievement discrepancy criterion results in the
exclusion of individuals with DD who do not show
a discrepancy because of the high comorbidity of other
developmental deficits that would affect IQ or
another achievement measure such as reading. With
this analytic decision comes the limitation that 
some individuals within the DD group may per- 
form poorly in mathematics testing due to a more 
globalized cognitive deficit (e.g., IQ) rather than a 
specific math deficit and, further, that this global 
deficit was not adequately controlled for when 
covarying out reading ability and executive
function. One solution suggested by Szűcs (2016) may
be to focus more on positioning individuals in a 
multidimensional parametric space that identifies 
specific cognitive functions related to mathematical 
performance. The current results suggest that num- 
ber-specific executive function is likely to be one 
such cognitive function. 

Fourth, this study makes the case that number-
specific executive function may be impaired in DD
and also related to mathematics achievement across 
a wide achievement range. This conclusion is based 
on the idea that a relation between two variables 
(i.e., math achievement and performance on incon- 
gruent trials of the number comparison task) sur- 
vives after controlling for individual differences in 
other cognitive factors (i.e., executive function in a 
non-numerical context). In this type of analysis, the 
conclusion is only as strong as the validity and 
specificity of control variables. In this study, only 
two variables are used as control measures for exec- 
utive function, and therefore, caution is warranted 
when considering how completely our variables 
controlled for all aspects of executive function unre- 
lated to number. 
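
The logic of a relation "surviving" a statistical control can be illustrated with a first-order partial correlation. The sketch below is a simplified stand-in for the regressions and ANCOVAs reported in this study, not the analysis itself: it uses a single control variable, the function names `pearson_r` and `partial_r` are hypothetical, and the numbers in the usage note are made up for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_r(x, y, z):
    """Correlation between x and y after removing the linear effect of z from both."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Illustrative values only (assumption): x = incongruent-trial accuracy,
# y = mathematics achievement, z = a non-numerical executive function score.
x = [1, 2, 3, 4]
y = [2, 1, 4, 3]
z = [1, -1, -1, 1]
print(partial_r(x, y, z))
```

In this toy example the control variable is uncorrelated with both measures, so the partial correlation equals the raw correlation. The more strongly a control variable overlaps with both measures, the more the partial correlation can shrink, which is why the conclusion depends on how well the control measures capture non-numerical executive function.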


Conclusion 


In sum, the two sets of analyses presented here 
suggest that performance on incongruent trials 
alone relates to the presence of severe mathematics 
learning deficits as well as individual differences 
in mathematics across a wide range of achievement,
even when excluding DD individuals.
Results suggest that number-specific executive
function is a unique predictor of mathematics 
achievement beyond measures that target the ANS 
or executive function independently. In order to 
understand how the intersection of these multiple 
cognitive mechanisms relates to the acquisition of 
mathematics skills, future studies should move 
from a domain-specific versus domain-general 
approach to experiments that deconstruct this 
framework. In so doing, future hypotheses can 
more closely address the integration of cognitive 
mechanisms required to complete a complex task 
such as mathematical thought. Furthermore, the 
current findings do little to explain the relation 
between nonsymbolic number perception and sym- 
bolic number. Understanding their relation may 
further explain why number-specific executive 
function relates to symbolic mathematics. This type 
of investigation may lead to an enhanced under- 
standing of what type of training or remediation 
of a specific skill set provides the most potential 
for transfer to improved mathematics achievement 
more broadly. Given that this study provides
support for an integral relation between a
“domain-general” mechanism and a number-specific one,
training that seeks to leverage this intersection
should be explored.


Appendix A
Detailed Results From 6.7th Percentile Cutoff 
Sample Achievement Group Analysis 


To make current results more easily comparable 
to previous literature that used differing cutoff 
thresholds for determining DD groups, this study 
also examined whether there were differences 
between two commonly used thresholds for deter- 
mining a dyscalculic sample. This threshold has 
varied widely across studies, and has likely con- 
tributed to disagreement among findings (Maz- 
zocco & Myers, 2003). Another commonly used 
threshold is mathematics achievement scores 
1.5 SDs below the nationally normed means, which 
is equivalent to performance below the 6.7th per- 
centile (Kaufmann et al., 2013; Price et al., 2007; 
Rotzer et al., 2009). This threshold resulted in the 
following achievement groupings: DD, < 6.7th
percentile; LA, 6.7th–25th percentile; TA, 25th–95th
percentile. Again, individuals were placed in
achievement groups if their mathematics
achievement scores were consistently in the designated
achievement range at two of the three early
assessments (pre-K to first grade) and two of the three
later assessments (fifth to seventh grades). Given
these criteria, 221 children fit into consistent
achievement groups across early and later
assessment periods: 11 children met the criteria for DD,
22 for LA, and the same 188 children were TA.
Descriptive statistics are reported in Table A1.
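
The cutoff arithmetic and the consistency rule described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the procedure as implemented in this study: the function names `percentile_for_sd_cutoff`, `band`, and `consistent_group` are hypothetical, scores are assumed to be percentile ranks, and the handling of scores falling exactly on a boundary is an assumption.

```python
from statistics import NormalDist
from typing import Optional, Sequence

def percentile_for_sd_cutoff(sd_below_mean: float) -> float:
    """Percentile of a normal distribution lying sd_below_mean SDs below the mean."""
    return 100 * NormalDist().cdf(-sd_below_mean)

def band(percentile: float, dd_cutoff: float = 6.7) -> Optional[str]:
    """Achievement band for one assessment (boundary handling is assumed)."""
    if percentile < dd_cutoff:
        return "DD"
    if percentile < 25:
        return "LA"
    if percentile <= 95:
        return "TA"
    return None  # above the 95th percentile: outside the three groups

def consistent_group(early: Sequence[float], later: Sequence[float],
                     dd_cutoff: float = 6.7) -> Optional[str]:
    """Group whose band is met on >= 2 of 3 early and >= 2 of 3 later assessments."""
    for group in ("DD", "LA", "TA"):
        early_hits = sum(band(p, dd_cutoff) == group for p in early)
        later_hits = sum(band(p, dd_cutoff) == group for p in later)
        if early_hits >= 2 and later_hits >= 2:
            return group
    return None  # no consistent group across both periods

print(round(percentile_for_sd_cutoff(1.5), 1))   # 1.5 SDs below the mean
print(consistent_group([3, 5, 30], [2, 6, 4]))   # illustrative scores only
```

Passing `dd_cutoff=10` gives the corresponding grouping for the 10th percentile threshold. A child whose band differs between the early and later periods returns `None`, which corresponds to the participants who did not fit a consistent achievement group.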


Results 


As in the first achievement group sample, there
were no gender differences in the distribution
of mathematics achievement
groups with the 6.7th percentile cutoff grouping,
Pearson χ²(2) = 4.045, p = .132, Cramér’s V = .132,
nor in mathematics achievement, t(446) = 1.182,
p = .238, Cohen’s d = 0.112, nor in nonsymbolic
comparison accuracy, t(446) = 0.780, p = .436, Cohen’s
d = 0.074, at sixth grade, the outcome year of
interest for the second set of primary analyses.


Detailed Results From 6.7th Percentile Cutoff Sample 
Achievement Group Analysis 


For the 6.7th percentile cutoff sample, there was
an effect of congruency in the DD and TA groups
[t(10) = 3.855, p = .003, Cohen’s d = 1.968 for DD;
t(187) = 6.795, p < .001, Cohen’s d = 0.844 for TA],
but not in the LA group [t(21) = 0.705, p = .068,
Cohen’s d = 0.705]. The right panel of Figure A1
shows the congruency effect for DD and TA groups
in the 6.7th percentile cutoff sample. Levene’s test of
equality of variances showed no significant
differences in variance across groups for mean accuracy of
congruent trials or incongruent trials. Results from
the ANOVA showed that there was no effect of
achievement group on number comparison
performance for congruent trials [F(2, 218) = 0.389,
p = .679, η² = .003], but there was a significant effect
of achievement group on number comparison
performance for incongruent trials [F(2, 218) = 4.947,
p = .008, η² = .043]. After adjusting for multiple
comparisons, one-tailed post hoc tests indicated
lower accuracy rates for DD than TA children
(Bonferroni adjusted p = .003, Hedges’ g = 0.997), lower
accuracy rates for DD than LA children (Bonferroni
adjusted p = .011, Hedges’ g = 0.821), and no
difference between LA and TA groups (Bonferroni
adjusted p = .500, Hedges’ g = 0.028).

Results from the ANCOVAs with the covariates
of mean accuracy on the Hearts and Flowers mixed
trials, max span on the backward Corsi block-tapping
test, age at Grade 6 testing, and letter-word
identification at the end of kindergarten indicated
there was no effect of achievement group on number
comparison performance for congruent trials
[F(2, 214) = 0.208, p = .812, partial η² = .002], but
there was a significant effect for incongruent trials
[F(2, 214) = 3.356, p = .037, partial η² = .030]. After
adjusting for multiple comparisons, one-tailed post
hoc tests indicated lower accuracy rates for DD
than TA children (Bonferroni adjusted p = .034,
Hedges’ g = 0.895), lower accuracy rates for DD
than LA children (Bonferroni adjusted p = .017,
Hedges’ g = 0.893), and no difference between LA and TA
groups (Bonferroni adjusted p = .500, Hedges’
g = 0.112). These results replicate the pattern
observed in the ANOVA.

The same ANOVAs and ANCOVAs were conducted
on groups formed with the 6.7th percentile
cutoff threshold for both congruent and incongruent
trials, and results fit the same pattern as those of
the 10th percentile cutoff. In sum, all ANOVAs
and ANCOVAs conducted on both the 10th and
6.7th percentile cutoff samples show the same
pattern of results whereby: (a) no group differences are
observed for congruent trials of the nonsymbolic
comparison task, (b) the DD group performs
significantly below LA and TA groups on incongruent
trials even when controlling for other cognitive
factors and early reading achievement, and (c) no
group differences are present between LA and TA
groups on incongruent trials.


Table A1
Descriptive Statistics for Experimental and Standardized Measures

                                                          10th percentile cutoff sample   6.7th percentile cutoff sample  Entire sample
                                                          (n = 222, 116 female)           (n = 221, 115 female)           (n = 448, 250 female)
Measure                                                   M      SD     Range             M      SD     Range             M      SD     Range
Age (years), pre-K                                        5.1    0.3    4.5-6.4           5.1    0.3    4.5-6.4
Age (years), sixth grade                                  12.0   0.3    11.4-13.4         12.0   0.3    11.4-13.4         12.0   0.32   11.4-13.4
Nonsymbolic comparison (accuracy, %)                      75.5   5.29   58.6-91.4         75.6   5.3    58.6-91.4         74.8   5.48   48.6-91.4
Backward Corsi (max span)*                                5.1    1.2    2-8               B21           2-8               4.81   1.22   2-8
Hearts and Flowers (accuracy, %)*                         76.4   14.4   40-100            76.8   13.9   44-100            73.4   14.5   35-100
Letter-word identification, WCJ-III (K, percentile rank)  111.8  14.1   73-144            111.8  14.2   73-144            109.7  12.7   73-144
Math achievement, WCJ-III (pre-K, percentile rank)        51.3   24.9   1.0-95.0          52.4   23.8   1.0-95.0
Math achievement, WCJ-III (K, percentile rank)            52.1   24.7   0.0-93.0          52.7   23.8   0.0-93.0
Math achievement, WCJ-III (first grade, percentile rank)  48.1   24.6   0.4-95.5          48.6   24.1   0.4-95.5
Math achievement, KM-3 (fifth grade, percentile rank)     39.2   23.5   0.5-96.2          39.6   23.1   0.7-96.2
Math achievement, KM-3 (sixth grade, percentile rank)     42.1   22.7   0.5-92.5          42.4   22.3   1.0-92.5          27.0   23.1   0.5-92.5
Math achievement, KM-3 (seventh grade, percentile rank)   42.6   22.9   0.5-94.1          42.9   22.5   0.5-94.1

Note. WCJ-III = Woodcock-Johnson III; KM-3 = KeyMath-3.
*Raw scores reported here for year available. See sections 2.4.2 and 2.4.3 for a detailed description of scores used for analyses.

Figure A1. Nonsymbolic number comparison accuracy rates for the sample with developmental dyscalculia (DD) defined as achievement below the 10th percentile (left) and 6.7th percentile (right) split by congruency. LA = low achieving. TA = typically achieving. Error bars represent standard errors. p-Values are indicated for differences in accuracy between congruent and incongruent trials (***p < .001).


Appendix B 


Details of Missing Sixth-Grade Corsi Span Scores 


At the sixth-grade assessment, 22 children (of
n = 448) did not proceed from instruction in the
backward Corsi to successful completion of a trial,
indicating noncompliance with the task or a failure
to understand instructions. Scores on outcome
measures and covariates of interest for these children
were different, on average, from those of children
who successfully completed the task (nonsymbolic
accuracy t(446) = 3.728, p < .001, Cohen’s d = 0.794;
Hearts and Flowers mean accuracy t(446) = 3.508,
p < .001, Cohen’s d = 0.716; sixth-grade mathematics
achievement t(446) = 2.587, p = .010, Cohen’s
d = 0.613). Therefore, to avoid nonrandom missing
data and include these children in our analyses, 
backward Corsi max span from the fifth grade was 
used, where available. To maintain the relative 
position of children’s scores in the fifth grade 
among other children’s sixth-grade scores (fifth- 
grade mean max span = 4.52, sixth-grade mean 
span = 4.88), both years of backward Corsi max 
spans were z-scored and fifth-grade z-scores of the 
22 children were used instead of sixth-grade z- 
scores, which were used for the other 426 children. 
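
The substitution described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `z_scores` and `corsi_covariate` and the dictionary layout (child identifier mapped to max span) are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev

def z_scores(scores):
    """Standardize one year's scores; `scores` maps child id -> max span."""
    values = list(scores.values())
    m, sd = mean(values), stdev(values)  # sample SD (assumption)
    return {child: (value - m) / sd for child, value in scores.items()}

def corsi_covariate(grade5, grade6):
    """Use the sixth-grade z-score where it exists, else the fifth-grade z-score."""
    z5, z6 = z_scores(grade5), z_scores(grade6)
    covariate = {}
    for child in sorted(set(grade5) | set(grade6)):
        if child in z6:
            covariate[child] = z6[child]
        elif child in z5:
            covariate[child] = z5[child]
    return covariate

# Illustrative data only: child "c" has no sixth-grade score.
grade5 = {"a": 4, "b": 5, "c": 3}
grade6 = {"a": 5, "b": 6}
print(corsi_covariate(grade5, grade6))
```

Standardizing within each year before substituting keeps a child's position relative to same-year peers, so the difference in mean span between the fifth and sixth grades does not bias the substituted values.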


Appendix C 
Exploration of the Model Fit 


In order to better interpret the nonlinear relation 
between accuracy on incongruent trials of the non- 
symbolic number comparison task and mathematics 
achievement, we plot this relation in Figure 3. This 
figure shows the fitted relation between untrans- 
formed sixth-grade mathematics achievement and 
nonsymbolic number comparison accuracy on 
incongruent trials for Model M6, holding Hearts 
and Flowers accuracy, backward Corsi span, early 
reading achievement, and age at testing in sixth 
grade at their sample means. As Figure 3 shows, 
the magnitude of the relation between accuracy on 
incongruent trials and mathematics achievement is 
greater for students with higher accuracy, on
average. For example, the difference between
students with 30% and 40% accuracy on nonsymbolic
number comparison is associated with a
difference of 1.0 percentile rank points in sixth-grade
mathematics achievement, on average. The
difference between students with 75% and 85% accuracy
on nonsymbolic number comparison is associated
with a difference of 1.3 percentile rank points in
sixth-grade mathematics achievement, on average.

Figure B1. Predicted sixth-grade mathematics achievement as a function of accuracy on incongruent trials of nonsymbolic number comparison, for students with average domain-general executive functioning and early reading achievement, and of average age at testing in sixth grade.